XGBoost eXplainable AI

[1]:
import shap
import pandas as pd
from src.models import retrieve_fit_model as rfm

Retrieving the latest, most accurate fitted XGBoost model

[2]:
fit_xgb_model = rfm.get_fit_mlflow_model('xgb')

Calculating Shapley values for the fitted XGBoost model

[3]:
def get_fit_model_shapley_values_and_explainer(fit_xgb_model):
  """Return the Shapley values computed on the processed test dataset
  and the TreeExplainer used to compute them.

  Keyword arguments:
  fit_xgb_model -- Fitted XGBoost model

  Returns a tuple (shap_values, explainer); shap_values has one row per
  line of the test dataset and one column per feature.
  """
  # TreeExplainer computes Shapley values for tree ensembles such as XGBoost
  explainer = shap.TreeExplainer(fit_xgb_model)
  data_for_prediction = pd.read_csv('../../data/processed/processed_application_test.csv')
  shap_values = explainer.shap_values(data_for_prediction)
  return shap_values, explainer
[4]:
shap_values, explainer = get_fit_model_shapley_values_and_explainer(fit_xgb_model)
ntree_limit is deprecated, use `iteration_range` or model slicing instead.

Showing the explainer base value and the shape of the Shapley values

[5]:
explainer.expected_value
[5]:
-2.649124
[6]:
shap_values.shape
[6]:
(10468, 236)
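
The base value is easier to read once converted to a probability. The following is a minimal sketch, assuming the model is a binary classifier trained with a logistic objective, so that TreeExplainer reports values in log-odds (raw margin) space; the notebook does not state the objective, so treat this as an assumption. The helper names `sigmoid` and `explained_margin` are hypothetical. Under that assumption, the base value plus the sum of one line's Shapley values gives the raw margin for that line, and the sigmoid maps it to a probability.

```python
import math

def sigmoid(margin):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-margin))

def explained_margin(base_value, row_shap_values):
    """Base value plus the per-feature contributions of one line."""
    return base_value + sum(row_shap_values)

base_value = -2.649124  # explainer.expected_value from the cell above
print(round(sigmoid(base_value), 4))  # about 0.066

# Hypothetical contributions for one line, for illustration only
margin = explained_margin(base_value, [0.40, -0.15, 0.05])
print(round(sigmoid(margin), 4))
```

In the notebook, `explained_margin(explainer.expected_value, shap_values[i])` would play the same role for line `i`, again only if the logistic assumption holds.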

Choosing line index to get explanations from

[7]:
line_index = 10

Extracting the test dataset line at that index

[8]:
test = pd.read_csv('../../data/processed/processed_application_test.csv')
line = test.iloc[line_index]

Visualizing explanations for a single line in the test dataset

[9]:
shap.initjs()
shap.force_plot(explainer.expected_value, shap_values[line_index], line)
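
The force plot is interactive and needs JavaScript to render. A plain-text view of the same information, namely which features push this line's output furthest from the base value, can be sketched as follows. The helper `top_contributions` and the example feature names and values are hypothetical, for illustration only.

```python
def top_contributions(feature_names, row_shap_values, k=5):
    """Return the k (feature, Shapley value) pairs with the largest absolute value."""
    pairs = zip(feature_names, row_shap_values)
    return sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)[:k]

# Hypothetical feature names and values, for illustration only
names = ["feature_a", "feature_b", "feature_c", "feature_d"]
values = [0.02, -0.31, 0.12, -0.05]
for name, value in top_contributions(names, values, k=3):
    direction = "raises" if value > 0 else "lowers"
    print(f"{name}: {value:+.2f} ({direction} the model output)")
```

In the notebook this would be called as `top_contributions(test.columns, shap_values[line_index])`.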

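Visualizing explanations for many lines of the test dataset at once (subsample of 1000 lines)

[10]:
# Pair the Shapley values with the same 1000 lines of the test dataset;
# test.sample(1000) would draw unrelated random lines and misalign the plot.
shap.force_plot(explainer.expected_value, shap_values[:1000], test.iloc[:1000])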
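
Row `i` of the Shapley value matrix explains row `i` of the test dataset, so any subsample has to select the same row positions from both. A minimal sketch of a reproducible random subsample follows; the helper `draw_row_positions` and the seed are hypothetical choices, not part of the original notebook.

```python
import random

def draw_row_positions(n_rows, k, seed=0):
    """Draw k distinct row positions in [0, n_rows), sorted, reproducibly."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_rows), k))

# 10468 is shap_values.shape[0] in this notebook
positions = draw_row_positions(10468, 1000)
print(len(positions))
```

The same list can then index both objects, for example `shap_values[positions]` and `test.iloc[positions]`, so that each explanation stays paired with the line it was computed from.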

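Visualizing a summary plot of feature contributions on the test dataset (subsample of 1000 lines)

[11]:
# Use the same 1000 lines for the Shapley values and the feature values;
# test.sample(1000) would pair each explanation with an unrelated line.
shap.summary_plot(shap_values[:1000], test.iloc[:1000])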
[Figure: SHAP summary plot, ../../_images/files_notebooks_6.0-cbw-xgboost-xai_19_0.png]
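
The summary plot orders features by the overall magnitude of their Shapley values. A comparable numeric ranking can be sketched by averaging the absolute Shapley value of each feature over all lines. The helper `mean_abs_shap` and the example data are hypothetical, for illustration only.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by their mean absolute Shapley value across all lines."""
    n_rows = len(shap_rows)
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        for j, value in enumerate(row):
            totals[j] += abs(value)
    ranking = [(name, total / n_rows) for name, total in zip(feature_names, totals)]
    return sorted(ranking, key=lambda pair: pair[1], reverse=True)

# Hypothetical Shapley values: 3 lines, 2 features
rows = [[0.10, -0.40], [-0.30, 0.20], [0.20, -0.60]]
for name, score in mean_abs_shap(rows, ["feature_a", "feature_b"]):
    print(f"{name}: {score:.2f}")
```

In the notebook, `mean_abs_shap(shap_values, test.columns)` would produce the ranking for all 236 features; with NumPy the same quantity is `np.abs(shap_values).mean(axis=0)`.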